<script>$(function(){initLeaderboardCaptions()});</script>
<h1>Captioning Leaderboard</h1>
<ul class="nav nav-pills ldbdTabs fontBigger">
  <li class="active"><a data-toggle="tab" id="a_cap_c5">Table-C5</a></li>
  <li><a data-toggle="tab" id="a_cap_c40">Table-C40</a></li>
  <li><a data-toggle="tab" id="a_cap_challenge2015">Challenge2015</a></li>
</ul>
<div class="tab-content" id="ldbdCaption">
  <table class="table order-column hover fontSmall ldbdData">
    <thead>
      <tr>
        <th></th>
        <th></th>
        <th>CIDEr-D</th>
        <th>METEOR</th>
        <th>ROUGE-L</th>
        <th>BLEU-1</th>
        <th>BLEU-2</th>
        <th>BLEU-3</th>
        <th>BLEU-4</th>
        <th>SPICE</th>
        <th>Date</th>
      </tr>
    </thead>
  </table>
  <h1>Metrics</h1>
  <table id="ldbdCaptionMetrics" class="table table-striped hover">
    <thead><tr>
      <th></th>
      <th></th>
    </tr></thead>
  </table>
  <p>
    For details of the data collection and evaluation, see <a href="http://arxiv.org/pdf/1504.00325.pdf" target="_blank">Microsoft COCO Captions: Data Collection and Evaluation Server</a>. Please also see the <a href="#captions-2015">challenge</a> and <a href="#guidelines">guidelines</a> pages for more information on the competition and dataset splits.
    Note: leaderboard results are imported from the <a href="https://competitions.codalab.org/competitions/3221" target="_blank">CodaLab Captioning Server</a> and may not reflect the latest submissions.
  </p>
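As context for the automatic-metric columns above, BLEU-1 is modified unigram precision with a brevity penalty. The sketch below is a minimal illustration only, not the evaluation server's implementation (the server uses the coco-caption toolkit); the function name <code>bleu1</code> is our own.

```python
import math
from collections import Counter

def bleu1(candidate, references):
    """Illustrative BLEU-1: clipped unigram precision times a brevity penalty.

    candidate: list of tokens; references: list of token lists.
    """
    cand_counts = Counter(candidate)
    # Clip each unigram count by its maximum count in any single reference.
    max_ref = Counter()
    for ref in references:
        for tok, n in Counter(ref).items():
            max_ref[tok] = max(max_ref[tok], n)
    clipped = sum(min(n, max_ref[tok]) for tok, n in cand_counts.items())
    precision = clipped / len(candidate)
    # Brevity penalty against the reference length closest to the candidate's.
    ref_len = min((len(r) for r in references),
                  key=lambda l: (abs(l - len(candidate)), l))
    bp = 1.0 if len(candidate) > ref_len else math.exp(1 - ref_len / len(candidate))
    return bp * precision
```

The clipping step is what makes the precision "modified": a candidate cannot gain credit by repeating a word more often than any reference does.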
</div>
<div class="tab-content" id="ldbdCaptionChallenge">
  <table class="table order-column hover fontSmall ldbdData">
    <thead>
      <tr>
        <th></th>
        <th></th>
        <th>M1</th>
        <th>M2</th>
        <th>M3</th>
        <th>M4</th>
        <th>M5</th>
        <th>Date</th>
      </tr>
    </thead>
  </table>
  <h1>Metrics</h1>
  <div>
    We conducted a human study to understand how satisfactory human judges find the captions submitted to the COCO captioning challenge. We were interested in three main questions:
    <br/><br/>
    <ul>
      <li> Which algorithm produces the best captions?</li>
      <li> What are the factors determining which is the best algorithm?</li>
      <li> Do the algorithms produce captions resembling human-generated sentences?</li>
    </ul>
    To address these questions, we developed five Graphical User Interfaces (GUIs) on the Amazon Mechanical Turk (AMT) platform to collect human judgments. From the responses we defined the following five metrics:
  </div>
  <table id="ldbdCaptionMetrics" class="table table-striped hover">
    <thead><tr>
      <th></th>
      <th></th>
    </tr></thead>
  </table>
  <h1>Ranking</h1>
  <p>The ranking for the competition was based on the results from M1 and M2; the other metrics served as diagnostics to aid interpretation of the results. Points were assigned to the top 5 teams as follows:</p>
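  The point scheme can be sketched in Python. The exact point values here (5 for first place down to 1 for fifth, on each of M1 and M2) and the function name <code>rank_teams</code> are assumptions for illustration; only the shape of the scheme (top-5 points on M1 and M2, summed into a total) comes from the description above.

```python
def rank_teams(m1_order, m2_order):
    """Illustrative point-based ranking.

    m1_order, m2_order: team names sorted best-first on each metric.
    Returns team names sorted by total points, highest first.
    """
    points = {}
    for order in (m1_order, m2_order):
        # Assumed point values: 5 for 1st place down to 1 for 5th.
        for place, team in enumerate(order[:5]):
            points[team] = points.get(team, 0) + (5 - place)
    return sorted(points, key=points.get, reverse=True)
```

  For example, a team placed first on both M1 and M2 would total 10 points and rank first overall.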
  <table id="ldbdCaptionRank" class="table table-striped hover">
    <thead><tr>
      <th></th>
      <th>M1</th>
      <th>M2</th>
      <th>TOTAL</th>
      <th>Ranking</th>
    </tr></thead>
  </table>
</div>
